N=1:

rank | frequency | n-gram |
---|---|---|
1 | 18614 | -n |
2 | 12980 | -a |
3 | 10024 | -i |
4 | 8737 | -e |
5 | 7266 | -é |

N=2:

rank | frequency | n-gram |
---|---|---|
1 | 9827 | -an |
2 | 5686 | -ng |
3 | 2637 | -un |
4 | 2630 | -né |
5 | 2569 | -en |

N=3:

rank | frequency | n-gram |
---|---|---|
1 | 2127 | -pun |
2 | 2009 | -aké |
3 | 1897 | -ing |
4 | 1764 | -ang |
5 | 1578 | -ané |

N=4:

rank | frequency | n-gram |
---|---|---|
1 | 2085 | -ipun |
2 | 1166 | -aken |
3 | 1094 | -ngan |
4 | 691 | -kaké |
5 | 476 | -kake |

N=5:

rank | frequency | n-gram |
---|---|---|
1 | 1096 | -nipun |
2 | 533 | -angan |
3 | 270 | -kaken |
4 | 249 | -akaké |
5 | 215 | -aning |

The tables show the most frequent letter n-grams at the end of words for N=1…5, one table per value of N. The procedure parallels 2.2.5 Most frequent word beginnings; the aim here is suffix detection rather than prefix detection.
For N=3:
SELECT @pos := (@pos + 1), xx.*
FROM (SELECT @pos := 0) r,
     (SELECT COUNT(*) AS cnt, CONCAT('-', RIGHT(word, 3))
      FROM words
      WHERE w_id > 100
      GROUP BY RIGHT(word, 3)
      ORDER BY cnt DESC) xx
LIMIT 5;
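
The same counting can be sketched outside the database, which may help when checking the query against a small word list. The Python below is only an illustration under stated assumptions: the function name `top_word_endings` and the toy list `sample` are hypothetical, the input is assumed to be the already-filtered list of words (the `w_id > 100` condition is not reproduced), and ties are broken by first occurrence, which the SQL query does not guarantee.

```python
from collections import Counter


def top_word_endings(words, n, limit=5):
    """Return (rank, frequency, n-gram) rows for the most frequent word endings.

    Hypothetical helper mirroring the query above; not part of the corpus tooling.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    # Like RIGHT(word, n), a word shorter than n contributes the whole word.
    counts = Counter("-" + word[-n:] for word in words)
    # most_common sorts by descending count; equal counts keep insertion order.
    ranked = counts.most_common(limit)
    return [(rank, freq, ngram) for rank, (ngram, freq) in enumerate(ranked, start=1)]


# Toy input for illustration only; real runs would read the corpus word list.
sample = ["omahipun", "asmanipun", "namanipun", "mlaku", "tuku", "buku"]

if __name__ == "__main__":
    for row in top_word_endings(sample, 3):
        print(*row, sep=" | ")
```

Changing the second argument from 3 to any value between 1 and 5 corresponds to changing the length passed to `RIGHT` in the query.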
2.2.5 Most frequent word beginnings